{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "2f43c205",
   "metadata": {},
   "source": [
    "<a href=\"https://colab.research.google.com/github/run-llama/llama_index/blob/main/docs/examples/objects/object_index.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "6f062ba1-0049-472c-8300-64f83d405ffc",
   "metadata": {},
   "source": [
    "# The `ObjectIndex` Class\n",
    "\n",
    "The `ObjectIndex` class is one that allows for the indexing of arbitrary Python objects. As such, it is quite flexible and applicable to a wide-range of use cases. As examples:\n",
    "- [Use an `ObjectIndex` to index Tool objects to then be used by an agent.](https://docs.llamaindex.ai/en/stable/examples/agent/openai_agent_retrieval/#building-an-object-index)\n",
    "- [Use an `ObjectIndex` to index a SQLTableSchema objects](https://docs.llamaindex.ai/en/stable/examples/index_structs/struct_indices/sqlindexdemo/#part-2-query-time-retrieval-of-tables-for-text-to-sql)\n",
    "\n",
    "To construct an `ObjectIndex`, we require an index as well as another abstraction, namely `ObjectNodeMapping`. This mapping, as its name suggests, provides the means to go between node and the associated object, and vice versa. Alternatively, there exists a `from_objects()` class method, that can conveniently construct an `ObjectIndex` from a set of objects.\n",
    "\n",
    "In this notebook, we'll quickly cover how you can build an `ObjectIndex` using a `SimpleObjectNodeMapping`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "56660700",
   "metadata": {},
   "outputs": [],
   "source": [
    "from llama_index.core import Settings\n",
    "\n",
    "Settings.embed_model = \"local\""
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "b46bc95b-e154-48e8-9475-350d2446e297",
   "metadata": {},
   "outputs": [],
   "source": [
    "from llama_index.core import VectorStoreIndex\n",
    "from llama_index.core.objects import ObjectIndex, SimpleObjectNodeMapping\n",
    "\n",
    "# some really arbitrary objects\n",
    "obj1 = {\"input\": \"Hey, how's it going\"}\n",
    "obj2 = [\"a\", \"b\", \"c\", \"d\"]\n",
    "obj3 = \"llamaindex is an awesome library!\"\n",
    "arbitrary_objects = [obj1, obj2, obj3]\n",
    "\n",
    "# (optional) object-node mapping\n",
    "obj_node_mapping = SimpleObjectNodeMapping.from_objects(arbitrary_objects)\n",
    "nodes = obj_node_mapping.to_nodes(arbitrary_objects)\n",
    "\n",
    "# object index\n",
    "object_index = ObjectIndex(\n",
    "    index=VectorStoreIndex(nodes=nodes),\n",
    "    object_node_mapping=obj_node_mapping,\n",
    ")\n",
    "\n",
    "# object index from_objects (default index_cls=VectorStoreIndex)\n",
    "object_index = ObjectIndex.from_objects(\n",
    "    arbitrary_objects, index_cls=VectorStoreIndex\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "bd9b90c3-4e72-46b9-b545-a816d4b9ed75",
   "metadata": {},
   "source": [
    "### As a retriever\n",
    "With the `object_index` in hand, we can use it as a retriever, to retrieve against the index objects."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "c8ed16df-5ea3-47bf-81a9-d9917c31d48f",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "['llamaindex is an awesome library!']"
      ]
     },
     "execution_count": null,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "object_retriever = object_index.as_retriever(similarity_top_k=1)\n",
    "object_retriever.retrieve(\"llamaindex\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "db53c825",
   "metadata": {},
   "source": [
    "We can also add node-postprocessors to an object index retriever, for easy convience to things like rerankers and more."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "df68a0d1",
   "metadata": {},
   "outputs": [],
   "source": [
    "%pip install llama-index-postprocessor-colbert-rerank"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "f1fd4b6e",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "['llamaindex is an awesome library!']"
      ]
     },
     "execution_count": null,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "from llama_index.postprocessor.colbert_rerank import ColbertRerank\n",
    "\n",
    "retriever = object_index.as_retriever(\n",
    "    similarity_top_k=2, node_postprocessors=[ColbertRerank(top_n=1)]\n",
    ")\n",
    "retriever.retrieve(\"a random list object\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "3c5d574c",
   "metadata": {},
   "source": [
    "## Using a Storage Integration (i.e. Chroma)\n",
    "\n",
    "The object index supports integrations with any existing storage backend in LlamaIndex.\n",
    "\n",
    "The following section walks through how to set that up using `Chroma` as an example."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "86a6eb5e",
   "metadata": {},
   "outputs": [],
   "source": [
    "%pip install llama-index-vector-stores-chroma"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "d3fa5b3f",
   "metadata": {},
   "outputs": [
    {
     "ename": "FileNotFoundError",
     "evalue": "[Errno 2] No such file or directory: './chroma_db2'",
     "output_type": "error",
     "traceback": [
      "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
      "\u001b[0;31mFileNotFoundError\u001b[0m                         Traceback (most recent call last)",
      "Cell \u001b[0;32mIn[31], line 5\u001b[0m\n\u001b[1;32m      2\u001b[0m \u001b[39mfrom\u001b[39;00m \u001b[39mllama_index\u001b[39;00m\u001b[39m.\u001b[39;00m\u001b[39mvector_stores\u001b[39;00m\u001b[39m.\u001b[39;00m\u001b[39mchroma\u001b[39;00m \u001b[39mimport\u001b[39;00m ChromaVectorStore\n\u001b[1;32m      3\u001b[0m \u001b[39mimport\u001b[39;00m \u001b[39mchromadb\u001b[39;00m\n\u001b[0;32m----> 5\u001b[0m db \u001b[39m=\u001b[39m chromadb\u001b[39m.\u001b[39;49mPersistentClient(path\u001b[39m=\u001b[39;49m\u001b[39m\"\u001b[39;49m\u001b[39m./chroma_db2\u001b[39;49m\u001b[39m\"\u001b[39;49m)\n\u001b[1;32m      6\u001b[0m chroma_collection \u001b[39m=\u001b[39m db\u001b[39m.\u001b[39mget_or_create_collection(\u001b[39m\"\u001b[39m\u001b[39mquickstart2\u001b[39m\u001b[39m\"\u001b[39m)\n\u001b[1;32m      7\u001b[0m vector_store \u001b[39m=\u001b[39m ChromaVectorStore(chroma_collection\u001b[39m=\u001b[39mchroma_collection)\n",
      "File \u001b[0;32m~/giant_change/llama_index/venv/lib/python3.10/site-packages/chromadb/__init__.py:146\u001b[0m, in \u001b[0;36mPersistentClient\u001b[0;34m(path, settings, tenant, database)\u001b[0m\n\u001b[1;32m    143\u001b[0m tenant \u001b[39m=\u001b[39m \u001b[39mstr\u001b[39m(tenant)\n\u001b[1;32m    144\u001b[0m database \u001b[39m=\u001b[39m \u001b[39mstr\u001b[39m(database)\n\u001b[0;32m--> 146\u001b[0m \u001b[39mreturn\u001b[39;00m ClientCreator(tenant\u001b[39m=\u001b[39;49mtenant, database\u001b[39m=\u001b[39;49mdatabase, settings\u001b[39m=\u001b[39;49msettings)\n",
      "File \u001b[0;32m~/giant_change/llama_index/venv/lib/python3.10/site-packages/chromadb/api/client.py:139\u001b[0m, in \u001b[0;36mClient.__init__\u001b[0;34m(self, tenant, database, settings)\u001b[0m\n\u001b[1;32m    133\u001b[0m \u001b[39mdef\u001b[39;00m \u001b[39m__init__\u001b[39m(\n\u001b[1;32m    134\u001b[0m     \u001b[39mself\u001b[39m,\n\u001b[1;32m    135\u001b[0m     tenant: \u001b[39mstr\u001b[39m \u001b[39m=\u001b[39m DEFAULT_TENANT,\n\u001b[1;32m    136\u001b[0m     database: \u001b[39mstr\u001b[39m \u001b[39m=\u001b[39m DEFAULT_DATABASE,\n\u001b[1;32m    137\u001b[0m     settings: Settings \u001b[39m=\u001b[39m Settings(),\n\u001b[1;32m    138\u001b[0m ) \u001b[39m-\u001b[39m\u001b[39m>\u001b[39m \u001b[39mNone\u001b[39;00m:\n\u001b[0;32m--> 139\u001b[0m     \u001b[39msuper\u001b[39;49m()\u001b[39m.\u001b[39;49m\u001b[39m__init__\u001b[39;49m(settings\u001b[39m=\u001b[39;49msettings)\n\u001b[1;32m    140\u001b[0m     \u001b[39mself\u001b[39m\u001b[39m.\u001b[39mtenant \u001b[39m=\u001b[39m tenant\n\u001b[1;32m    141\u001b[0m     \u001b[39mself\u001b[39m\u001b[39m.\u001b[39mdatabase \u001b[39m=\u001b[39m database\n",
      "File \u001b[0;32m~/giant_change/llama_index/venv/lib/python3.10/site-packages/chromadb/api/client.py:43\u001b[0m, in \u001b[0;36mSharedSystemClient.__init__\u001b[0;34m(self, settings)\u001b[0m\n\u001b[1;32m     38\u001b[0m \u001b[39mdef\u001b[39;00m \u001b[39m__init__\u001b[39m(\n\u001b[1;32m     39\u001b[0m     \u001b[39mself\u001b[39m,\n\u001b[1;32m     40\u001b[0m     settings: Settings \u001b[39m=\u001b[39m Settings(),\n\u001b[1;32m     41\u001b[0m ) \u001b[39m-\u001b[39m\u001b[39m>\u001b[39m \u001b[39mNone\u001b[39;00m:\n\u001b[1;32m     42\u001b[0m     \u001b[39mself\u001b[39m\u001b[39m.\u001b[39m_identifier \u001b[39m=\u001b[39m SharedSystemClient\u001b[39m.\u001b[39m_get_identifier_from_settings(settings)\n\u001b[0;32m---> 43\u001b[0m     SharedSystemClient\u001b[39m.\u001b[39;49m_create_system_if_not_exists(\u001b[39mself\u001b[39;49m\u001b[39m.\u001b[39;49m_identifier, settings)\n",
      "File \u001b[0;32m~/giant_change/llama_index/venv/lib/python3.10/site-packages/chromadb/api/client.py:54\u001b[0m, in \u001b[0;36mSharedSystemClient._create_system_if_not_exists\u001b[0;34m(cls, identifier, settings)\u001b[0m\n\u001b[1;32m     51\u001b[0m     \u001b[39mcls\u001b[39m\u001b[39m.\u001b[39m_identifer_to_system[identifier] \u001b[39m=\u001b[39m new_system\n\u001b[1;32m     53\u001b[0m     new_system\u001b[39m.\u001b[39minstance(ProductTelemetryClient)\n\u001b[0;32m---> 54\u001b[0m     new_system\u001b[39m.\u001b[39;49minstance(ServerAPI)\n\u001b[1;32m     56\u001b[0m     new_system\u001b[39m.\u001b[39mstart()\n\u001b[1;32m     57\u001b[0m \u001b[39melse\u001b[39;00m:\n",
      "File \u001b[0;32m~/giant_change/llama_index/venv/lib/python3.10/site-packages/chromadb/config.py:382\u001b[0m, in \u001b[0;36mSystem.instance\u001b[0;34m(self, type)\u001b[0m\n\u001b[1;32m    379\u001b[0m     \u001b[39mtype\u001b[39m \u001b[39m=\u001b[39m get_class(fqn, \u001b[39mtype\u001b[39m)\n\u001b[1;32m    381\u001b[0m \u001b[39mif\u001b[39;00m \u001b[39mtype\u001b[39m \u001b[39mnot\u001b[39;00m \u001b[39min\u001b[39;00m \u001b[39mself\u001b[39m\u001b[39m.\u001b[39m_instances:\n\u001b[0;32m--> 382\u001b[0m     impl \u001b[39m=\u001b[39m \u001b[39mtype\u001b[39;49m(\u001b[39mself\u001b[39;49m)\n\u001b[1;32m    383\u001b[0m     \u001b[39mself\u001b[39m\u001b[39m.\u001b[39m_instances[\u001b[39mtype\u001b[39m] \u001b[39m=\u001b[39m impl\n\u001b[1;32m    384\u001b[0m     \u001b[39mif\u001b[39;00m \u001b[39mself\u001b[39m\u001b[39m.\u001b[39m_running:\n",
      "File \u001b[0;32m~/giant_change/llama_index/venv/lib/python3.10/site-packages/chromadb/api/segment.py:102\u001b[0m, in \u001b[0;36mSegmentAPI.__init__\u001b[0;34m(self, system)\u001b[0m\n\u001b[1;32m    100\u001b[0m \u001b[39msuper\u001b[39m()\u001b[39m.\u001b[39m\u001b[39m__init__\u001b[39m(system)\n\u001b[1;32m    101\u001b[0m \u001b[39mself\u001b[39m\u001b[39m.\u001b[39m_settings \u001b[39m=\u001b[39m system\u001b[39m.\u001b[39msettings\n\u001b[0;32m--> 102\u001b[0m \u001b[39mself\u001b[39m\u001b[39m.\u001b[39m_sysdb \u001b[39m=\u001b[39m \u001b[39mself\u001b[39;49m\u001b[39m.\u001b[39;49mrequire(SysDB)\n\u001b[1;32m    103\u001b[0m \u001b[39mself\u001b[39m\u001b[39m.\u001b[39m_manager \u001b[39m=\u001b[39m \u001b[39mself\u001b[39m\u001b[39m.\u001b[39mrequire(SegmentManager)\n\u001b[1;32m    104\u001b[0m \u001b[39mself\u001b[39m\u001b[39m.\u001b[39m_quota \u001b[39m=\u001b[39m \u001b[39mself\u001b[39m\u001b[39m.\u001b[39mrequire(QuotaEnforcer)\n",
      "File \u001b[0;32m~/giant_change/llama_index/venv/lib/python3.10/site-packages/chromadb/config.py:281\u001b[0m, in \u001b[0;36mComponent.require\u001b[0;34m(self, type)\u001b[0m\n\u001b[1;32m    278\u001b[0m \u001b[39mdef\u001b[39;00m \u001b[39mrequire\u001b[39m(\u001b[39mself\u001b[39m, \u001b[39mtype\u001b[39m: Type[T]) \u001b[39m-\u001b[39m\u001b[39m>\u001b[39m T:\n\u001b[1;32m    279\u001b[0m \u001b[39m    \u001b[39m\u001b[39m\"\"\"Get a Component instance of the given type, and register as a dependency of\u001b[39;00m\n\u001b[1;32m    280\u001b[0m \u001b[39m    that instance.\"\"\"\u001b[39;00m\n\u001b[0;32m--> 281\u001b[0m     inst \u001b[39m=\u001b[39m \u001b[39mself\u001b[39;49m\u001b[39m.\u001b[39;49m_system\u001b[39m.\u001b[39;49minstance(\u001b[39mtype\u001b[39;49m)\n\u001b[1;32m    282\u001b[0m     \u001b[39mself\u001b[39m\u001b[39m.\u001b[39m_dependencies\u001b[39m.\u001b[39madd(inst)\n\u001b[1;32m    283\u001b[0m     \u001b[39mreturn\u001b[39;00m inst\n",
      "File \u001b[0;32m~/giant_change/llama_index/venv/lib/python3.10/site-packages/chromadb/config.py:382\u001b[0m, in \u001b[0;36mSystem.instance\u001b[0;34m(self, type)\u001b[0m\n\u001b[1;32m    379\u001b[0m     \u001b[39mtype\u001b[39m \u001b[39m=\u001b[39m get_class(fqn, \u001b[39mtype\u001b[39m)\n\u001b[1;32m    381\u001b[0m \u001b[39mif\u001b[39;00m \u001b[39mtype\u001b[39m \u001b[39mnot\u001b[39;00m \u001b[39min\u001b[39;00m \u001b[39mself\u001b[39m\u001b[39m.\u001b[39m_instances:\n\u001b[0;32m--> 382\u001b[0m     impl \u001b[39m=\u001b[39m \u001b[39mtype\u001b[39;49m(\u001b[39mself\u001b[39;49m)\n\u001b[1;32m    383\u001b[0m     \u001b[39mself\u001b[39m\u001b[39m.\u001b[39m_instances[\u001b[39mtype\u001b[39m] \u001b[39m=\u001b[39m impl\n\u001b[1;32m    384\u001b[0m     \u001b[39mif\u001b[39;00m \u001b[39mself\u001b[39m\u001b[39m.\u001b[39m_running:\n",
      "File \u001b[0;32m~/giant_change/llama_index/venv/lib/python3.10/site-packages/chromadb/db/impl/sqlite.py:88\u001b[0m, in \u001b[0;36mSqliteDB.__init__\u001b[0;34m(self, system)\u001b[0m\n\u001b[1;32m     84\u001b[0m     \u001b[39mself\u001b[39m\u001b[39m.\u001b[39m_db_file \u001b[39m=\u001b[39m (\n\u001b[1;32m     85\u001b[0m         \u001b[39mself\u001b[39m\u001b[39m.\u001b[39m_settings\u001b[39m.\u001b[39mrequire(\u001b[39m\"\u001b[39m\u001b[39mpersist_directory\u001b[39m\u001b[39m\"\u001b[39m) \u001b[39m+\u001b[39m \u001b[39m\"\u001b[39m\u001b[39m/chroma.sqlite3\u001b[39m\u001b[39m\"\u001b[39m\n\u001b[1;32m     86\u001b[0m     )\n\u001b[1;32m     87\u001b[0m     \u001b[39mif\u001b[39;00m \u001b[39mnot\u001b[39;00m os\u001b[39m.\u001b[39mpath\u001b[39m.\u001b[39mexists(\u001b[39mself\u001b[39m\u001b[39m.\u001b[39m_db_file):\n\u001b[0;32m---> 88\u001b[0m         os\u001b[39m.\u001b[39;49mmakedirs(os\u001b[39m.\u001b[39;49mpath\u001b[39m.\u001b[39;49mdirname(\u001b[39mself\u001b[39;49m\u001b[39m.\u001b[39;49m_db_file), exist_ok\u001b[39m=\u001b[39;49m\u001b[39mTrue\u001b[39;49;00m)\n\u001b[1;32m     89\u001b[0m     \u001b[39mself\u001b[39m\u001b[39m.\u001b[39m_conn_pool \u001b[39m=\u001b[39m PerThreadPool(\u001b[39mself\u001b[39m\u001b[39m.\u001b[39m_db_file)\n\u001b[1;32m     90\u001b[0m \u001b[39mself\u001b[39m\u001b[39m.\u001b[39m_tx_stack \u001b[39m=\u001b[39m local()\n",
      "File \u001b[0;32m~/miniforge3/lib/python3.10/os.py:225\u001b[0m, in \u001b[0;36mmakedirs\u001b[0;34m(name, mode, exist_ok)\u001b[0m\n\u001b[1;32m    223\u001b[0m         \u001b[39mreturn\u001b[39;00m\n\u001b[1;32m    224\u001b[0m \u001b[39mtry\u001b[39;00m:\n\u001b[0;32m--> 225\u001b[0m     mkdir(name, mode)\n\u001b[1;32m    226\u001b[0m \u001b[39mexcept\u001b[39;00m \u001b[39mOSError\u001b[39;00m:\n\u001b[1;32m    227\u001b[0m     \u001b[39m# Cannot rely on checking for EEXIST, since the operating system\u001b[39;00m\n\u001b[1;32m    228\u001b[0m     \u001b[39m# could give priority to other errors like EACCES or EROFS\u001b[39;00m\n\u001b[1;32m    229\u001b[0m     \u001b[39mif\u001b[39;00m \u001b[39mnot\u001b[39;00m exist_ok \u001b[39mor\u001b[39;00m \u001b[39mnot\u001b[39;00m path\u001b[39m.\u001b[39misdir(name):\n",
      "\u001b[0;31mFileNotFoundError\u001b[0m: [Errno 2] No such file or directory: './chroma_db2'"
     ]
    }
   ],
   "source": [
    "from llama_index.core import StorageContext, VectorStoreIndex\n",
    "from llama_index.vector_stores.chroma import ChromaVectorStore\n",
    "import chromadb\n",
    "\n",
    "db = chromadb.PersistentClient(path=\"./chroma_db\")\n",
    "chroma_collection = db.get_or_create_collection(\"quickstart2\")\n",
    "vector_store = ChromaVectorStore(chroma_collection=chroma_collection)\n",
    "storage_context = StorageContext.from_defaults(vector_store=vector_store)\n",
    "\n",
    "object_index = ObjectIndex.from_objects(\n",
    "    arbitrary_objects,\n",
    "    index_cls=VectorStoreIndex,\n",
    "    storage_context=storage_context,\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "28cda697",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "['llamaindex is an awesome library!']"
      ]
     },
     "execution_count": null,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "object_retriever = object_index.as_retriever(similarity_top_k=1)\n",
    "object_retriever.retrieve(\"llamaindex\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "358994af",
   "metadata": {},
   "source": [
    "Now, lets \"reload\" the index"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "61134380",
   "metadata": {},
   "outputs": [],
   "source": [
    "db = chromadb.PersistentClient(path=\"./chroma_db\")\n",
    "chroma_collection = db.get_or_create_collection(\"quickstart\")\n",
    "vector_store = ChromaVectorStore(chroma_collection=chroma_collection)\n",
    "\n",
    "index = VectorStoreIndex.from_vector_store(vector_store=vector_store)\n",
    "\n",
    "object_index = ObjectIndex.from_objects_and_index(arbitrary_objects, index)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "644a32d0",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "['llamaindex is an awesome library!']"
      ]
     },
     "execution_count": null,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "object_retriever = object_index.as_retriever(similarity_top_k=1)\n",
    "object_retriever.retrieve(\"llamaindex\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "78bfe419",
   "metadata": {},
   "source": [
    "Note that when we reload the index, we still have to pass the objects, since those are not saved in the actual index/vector db."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "a578c2c9",
   "metadata": {},
   "source": [
    "## [Advanced] Customizing the Mapping\n",
    "\n",
    "For specialized cases where you want full control over how objects are mapped to nodes, you can also provide a `to_node_fn()` and `from_node_fn()` hook.\n",
    "\n",
    "This is useful for when you are converting specialized objects, or want to dynamically create objects at runtime rather than keeping them in memory.\n",
    "\n",
    "A small example is shown below."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "1cdbd6c9",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "['llamaindex is an awesome library!']"
      ]
     },
     "execution_count": null,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "from llama_index.core.schema import TextNode\n",
    "\n",
    "my_objects = {\n",
    "    str(hash(str(obj))): obj for i, obj in enumerate(arbitrary_objects)\n",
    "}\n",
    "\n",
    "\n",
    "def from_node_fn(node):\n",
    "    return my_objects[node.id]\n",
    "\n",
    "\n",
    "def to_node_fn(obj):\n",
    "    return TextNode(id=str(hash(str(obj))), text=str(obj))\n",
    "\n",
    "\n",
    "object_index = ObjectIndex.from_objects(\n",
    "    arbitrary_objects,\n",
    "    index_cls=VectorStoreIndex,\n",
    "    from_node_fn=from_node_fn,\n",
    "    to_node_fn=to_node_fn,\n",
    ")\n",
    "\n",
    "object_retriever = object_index.as_retriever(similarity_top_k=1)\n",
    "\n",
    "object_retriever.retrieve(\"llamaindex\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "0032d0e7-815d-414d-9fcc-384709b59484",
   "metadata": {},
   "source": [
    "## Persisting `ObjectIndex` to Disk with Objects\n",
    "\n",
    "When it comes to persisting the `ObjectIndex`, we have to handle both the index as well as the object-node mapping. Persisting the index is straightforward and can be handled by usual means (e.g., see this [guide](https://docs.llamaindex.ai/en/stable/module_guides/storing/save_load.html#persisting-loading-data)). However, it's a bit of a different story when it comes to persisting the `ObjectNodeMapping`. Since we're indexing aribtrary Python objects with the `ObjectIndex`, it may be the case (and perhaps more often than we'd like), that the arbitrary objects are not serializable. In those cases, you can persist the index, but the user would have to maintain a way to re-construct the `ObjectNodeMapping` to be able to re-construct the `ObjectIndex`. For convenience, there are the `persist` and `from_persist_dir` methods on the `ObjectIndex` that will attempt to persist and load a previously saved `ObjectIndex`, respectively."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "10053dbc-e9c2-4a54-9b04-a7c66af80860",
   "metadata": {},
   "source": [
    "### Happy example"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "4ad58419-35b5-4010-ae3f-42f96e6c7226",
   "metadata": {},
   "outputs": [],
   "source": [
    "# persist to disk (no path provided will persist to the default path ./storage)\n",
    "object_index.persist()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "3aedfc02-a94b-4f0a-8e8b-a22bf76515fe",
   "metadata": {},
   "outputs": [],
   "source": [
    "# re-loading (no path provided will attempt to load from the default path ./storage)\n",
    "reloaded_object_index = ObjectIndex.from_persist_dir()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "4ec0c2ba-ef26-41fb-b2ae-df946abac01c",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "{7981070310142320670: {'input': \"Hey, how's it going\"},\n",
       " -5984737625581842527: ['a', 'b', 'c', 'd'],\n",
       " -8305186196625446821: 'llamaindex is an awesome library!'}"
      ]
     },
     "execution_count": null,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "reloaded_object_index._object_node_mapping.obj_node_mapping"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "34107c44-d51c-42a9-a802-fba67deba8d0",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "{7981070310142320670: {'input': \"Hey, how's it going\"},\n",
       " -5984737625581842527: ['a', 'b', 'c', 'd'],\n",
       " -8305186196625446821: 'llamaindex is an awesome library!'}"
      ]
     },
     "execution_count": null,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "object_index._object_node_mapping.obj_node_mapping"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "8078b5ff-0047-4a0c-96ea-a9a768e060ae",
   "metadata": {},
   "source": [
    "### Example of when it doesn't work"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "7b19b0c3-3bfb-4ed8-bcf0-d04615afbbd5",
   "metadata": {},
   "outputs": [],
   "source": [
    "from llama_index.core.tools import FunctionTool\n",
    "from llama_index.core import SummaryIndex\n",
    "from llama_index.core.objects import SimpleToolNodeMapping\n",
    "\n",
    "\n",
    "def add(a: int, b: int) -> int:\n",
    "    \"\"\"Add two integers and returns the result integer\"\"\"\n",
    "    return a + b\n",
    "\n",
    "\n",
    "def multiply(a: int, b: int) -> int:\n",
    "    \"\"\"Multiple two integers and returns the result integer\"\"\"\n",
    "    return a * b\n",
    "\n",
    "\n",
    "multiply_tool = FunctionTool.from_defaults(fn=multiply)\n",
    "add_tool = FunctionTool.from_defaults(fn=add)\n",
    "\n",
    "object_mapping = SimpleToolNodeMapping.from_objects([add_tool, multiply_tool])\n",
    "object_index = ObjectIndex.from_objects(\n",
    "    [add_tool, multiply_tool], object_mapping\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "63c777a2-d73b-4916-96c4-794fe5ebcac5",
   "metadata": {},
   "outputs": [
    {
     "ename": "NotImplementedError",
     "evalue": "Subclasses should implement this!",
     "output_type": "error",
     "traceback": [
      "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
      "\u001b[0;31mNotImplementedError\u001b[0m                       Traceback (most recent call last)",
      "Cell \u001b[0;32mIn[4], line 2\u001b[0m\n\u001b[1;32m      1\u001b[0m \u001b[38;5;66;03m# trying to persist the object_mapping directly will raise an error\u001b[39;00m\n\u001b[0;32m----> 2\u001b[0m \u001b[43mobject_mapping\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mpersist\u001b[49m\u001b[43m(\u001b[49m\u001b[43m)\u001b[49m\n",
      "File \u001b[0;32m~/Projects/llama_index/llama_index/objects/tool_node_mapping.py:47\u001b[0m, in \u001b[0;36mBaseToolNodeMapping.persist\u001b[0;34m(self, persist_dir, obj_node_mapping_fname)\u001b[0m\n\u001b[1;32m     43\u001b[0m \u001b[38;5;28;01mdef\u001b[39;00m \u001b[38;5;21mpersist\u001b[39m(\n\u001b[1;32m     44\u001b[0m     \u001b[38;5;28mself\u001b[39m, persist_dir: \u001b[38;5;28mstr\u001b[39m \u001b[38;5;241m=\u001b[39m \u001b[38;5;241m.\u001b[39m\u001b[38;5;241m.\u001b[39m\u001b[38;5;241m.\u001b[39m, obj_node_mapping_fname: \u001b[38;5;28mstr\u001b[39m \u001b[38;5;241m=\u001b[39m \u001b[38;5;241m.\u001b[39m\u001b[38;5;241m.\u001b[39m\u001b[38;5;241m.\u001b[39m\n\u001b[1;32m     45\u001b[0m ) \u001b[38;5;241m-\u001b[39m\u001b[38;5;241m>\u001b[39m \u001b[38;5;28;01mNone\u001b[39;00m:\n\u001b[1;32m     46\u001b[0m \u001b[38;5;250m    \u001b[39m\u001b[38;5;124;03m\"\"\"Persist objs.\"\"\"\u001b[39;00m\n\u001b[0;32m---> 47\u001b[0m     \u001b[38;5;28;01mraise\u001b[39;00m \u001b[38;5;167;01mNotImplementedError\u001b[39;00m(\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mSubclasses should implement this!\u001b[39m\u001b[38;5;124m\"\u001b[39m)\n",
      "\u001b[0;31mNotImplementedError\u001b[0m: Subclasses should implement this!"
     ]
    }
   ],
   "source": [
    "# trying to persist the object_mapping directly will raise an error\n",
    "object_mapping.persist()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "de77bbca-9ba8-46f9-a66d-60cf1ce143ad",
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "/var/folders/0g/wd11bmkd791fz7hvgy1kqyp00000gn/T/ipykernel_77363/46708458.py:2: UserWarning: Unable to persist ObjectNodeMapping. You will need to reconstruct the same object node mapping to build this ObjectIndex\n",
      "  object_index.persist()\n"
     ]
    }
   ],
   "source": [
    "# try to persist the object index here will throw a Warning to the user\n",
    "object_index.persist()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "ee002c84-fe00-43e5-b0cb-53f6fb547b13",
   "metadata": {},
   "source": [
    "**In this case, only the index has been persisted.** In order to re-construct the `ObjectIndex` as mentioned above, we will need to manually re-construct `ObjectNodeMapping` and supply that to the `ObjectIndex.from_persist_dir` method."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "cbc69da5-d2f0-4feb-9527-436cd0d0a54b",
   "metadata": {},
   "outputs": [],
   "source": [
    "reloaded_object_index = ObjectIndex.from_persist_dir(\n",
    "    object_node_mapping=object_mapping  # without this, an error will be thrown\n",
    ")"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "venv",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
